In accordance with the Court’s March 28, 2019 Order Setting Further Status
Conference, ECF No. 391, Defendants hereby submit the attached Proposed Expanded Ms.
L. Class Identification Plan Summary, and supporting declarations.


DATED: April 5, 2019 

Respectfully submitted,


JOSEPH H. HUNT 

Assistant Attorney General 

SCOTT G. STEWART 

Deputy Assistant Attorney General 

WILLIAM C. PEACHEY 

Director 

WILLIAM C. SILVIS 

Assistant Director 

/s/ Sarah B. Fabian 

SARAH B. FABIAN 

Senior Litigation Counsel 

Office of Immigration Litigation 

Civil Division 

U.S. Department of Justice 

P.O. Box 868, Ben Franklin Station 

Washington, DC 20044 

202-532-4824 

(202) 305-7000 (facsimile) 

Sarah.B.Fabian@usdoj.gov

CERTIFICATE OF SERVICE

IT IS HEREBY CERTIFIED THAT: 

I, the undersigned, am a citizen of the United States and am at least eighteen years 
of age. My business address is Box 868, Ben Franklin Station, Washington DC 20044. I 
am not a party to the above-entitled action. I have caused service of the accompanying brief 
on all counsel of record, by electronically filing the foregoing with the Clerk of the District 
Court using its ECF System, which electronically provides notice. 

I declare under penalty of perjury that the foregoing is true and correct. 

DATED: April 5, 2019 s/Sarah B. Fabian 

Sarah B. Fabian

PROPOSED EXPANDED MS. L. CLASS IDENTIFICATION PLAN SUMMARY


On March 8, 2019, the Court expanded the Ms. L. class to include adult parents who entered the
United States at or between ports of entry on or after July 1, 2017. The Court has also instructed
Defendants to put forth a potential plan for identifying the class members within the class
expansion period of July 1, 2017, through June 25, 2018.

Defendants’ proposed plan to identify potential Ms. L. class members within the class expansion 
period is explained in the attached declarations from Commander Jonathan White of the United 
States Public Health Service and Dr. Barry Graubard of the National Institutes of Health.

In short, Defendants would identify potential Ms. L. class members by identifying their children
out of the total population of approximately 47,000 children discharged by the Office of Refugee
Resettlement (ORR) during the class expansion period. Defendants would attempt to streamline
and accelerate identification of children of potential Ms. L. class members by using programmatic
knowledge, data analysis, and statistical science to try as best as practicable to segment the 
population based on the probability that the child’s parent is a Ms. L. class member. If successful, 
segmentation would enable Defendants to prioritize children for manual reviews of ORR case 
management records, which would confirm whether the child was, in fact, separated from a parent 
who is a Ms. L. class member for the class expansion period. 

The operational leads for the work would be: Commander Jonathan White for the U.S. Department
of Health and Human Services (HHS), Melissa Harper for U.S. Immigration and Customs 
Enforcement (ICE), and Jay Visconti for U.S. Customs and Border Protection (CBP). They would 
convene an inter-agency Data Analysis Team. A senior biostatistician (likely Dr. Graubard from 
the NIH) would serve as the lead for the Data Analysis Team. 

Within approximately four weeks of plan activation, Defendants anticipate that the Data Analysis
Team would conduct a regression analysis of the possible children of potential class members for
the original class period reported in the most recent Joint Status Report, ECF No. 388, using the
approximately 12,000 children who were in ORR care on June 26, 2018 as a “training set” to
develop a prediction model. The Data Analysis Team would work to validate variables that may
be predictive of a child having been separated from a parent (e.g., the age of the child), and attempt
to identify any additional demographic features of children separated from parents (as distinct from
children who entered the United States without a parent). Through validation, the team would
develop a prediction model correlating relevant variables with increased likelihood of parental
class membership. 

Within approximately eight weeks of plan activation, Defendants anticipate that the Data Analysis
Team would begin using the prediction model to rank order the children among the population of
approximately 47,000 for the class expansion period according to their probability of being
children of potential Ms. L. class members. They would then begin grouping the children into
segments based on statistical probability of parental class membership. Using this method,
Defendants would begin targeting manual case file review on the higher-probability groups. In
addition, representative samples would be taken from lower-probability groups to test them.
As children are identified as possible children of potential Ms. L. class members, Defendants would
validate their status jointly.

Within approximately 12 weeks of plan activation, Defendants would begin consolidating 
information about any newly-identified possible child of a potential Ms. L. class member with 
information about the potential Ms. L. class member known to Defendants. Defendants would 
provide final, rolling lists to Class Counsel. The rolling lists would include basic information 
including the names and alien numbers of the children and their class member parents, and the 
parents’ last known contact information. 

Defendants estimate that identifying all possible children of potential Ms. L. class members 
referred to and discharged by ORR during the expansion period would take at least 12 months, and 
possibly up to 24 months. The time required to complete the work may be affected by at least 
three factors. The first is the efficacy of the initial prediction model and the outcomes of sampling 
of the lower-probability groups (which are not known at this juncture). The second is the pace of 
manual record review (which will depend on how many qualified contractors Defendants are able 
to hire and train for the Case File Review Team). The third factor is any meet-and-confer process 
that may occur after manual reviews for the initial, higher-probability groups are complete. 

The primary benefit of Defendants’ proposed plan is that, if successful, it would front-load the 
identification of potential Ms. L. class members and possibly lead to a reduction in the overall time 
required for manual review. For this reason, it is a more rational approach than a date-ordered or 
randomized manual review.

UNITED STATES DISTRICT COURT
SOUTHERN DISTRICT OF CALIFORNIA 


MS. L., et al. Case No. 18cv428 DMS MDD 

Petitioners-Plaintiffs, Hon. Dana M. Sabraw 

vs. 

U.S. IMMIGRATION AND CUSTOMS 
ENFORCEMENT, et al., 

Respondents-Defendants. 


DECLARATION OF JONATHAN WHITE 

I, Jonathan White, declare under penalty of perjury, pursuant to 28 U.S.C. § 1746, 
that my testimony below is true and correct: 

1. I am a Commander with the United States Public Health Service 
Commissioned Corps, and have served at the Department of Health and Human Services 
(“HHS”) in three successive presidential administrations. I am presently assigned to the 
Office of the Assistant Secretary for Preparedness and Response (“ASPR”), and previously 
served as the Deputy Director of the Office of Refugee Resettlement (“ORR”). 

2. The statements in this declaration are based on my personal knowledge, 
information acquired by me in the course of performing my official duties, information 
supplied to me by federal government employees, and government records. 

3. I am providing this declaration for use by the Defendants and the Court in Ms. 
L. v. ICE, No. 18-CV-428 (S.D. Cal.).

Background and Recommended Methodology 

4. My understanding is that on March 8, 2019, this Court expanded the class in 
Ms. L. The class is now defined as: “All adult parents who entered the United States at or 
between designated ports of entry on or after July 1, 2017, who (1) have been, are, or will 
be detained in immigration custody by the DHS, and (2) have a minor child who has been,
is or will be separated from them by DHS and has been, is or will be detained in ORR 
custody, ORR foster care, or DHS custody, absent a determination that the parent is unfit 
or presents a danger to the child.” ECF No. 386. The same qualifications apply to the 
original and expanded classes. “[T]he class does not include migrant parents with criminal 
history or communicable disease, or those who are in the interior of the United States or 
subject to the EO.” ECF No. 82. 

5. The Defendants have previously identified the children of potential Ms. L. class 
members who were in the care of ORR on June 26, 2018. As I have previously explained, 
the process of identifying those children involved analysis of dozens of data sets from U.S. 
Customs and Border Protection (CBP) and U.S. Immigration and Customs Enforcement 
(ICE), manual review of approximately 12,000 individualized ORR case management 
records, and reconciliation with sworn testimony from the ORR grantees caring for the 
children. ECF No. 347-1. Ultimately, this process “was operationally feasible because the 
children were still in ORR custody, and ORR grantees were able to talk with them about 
separation and share the information with HHS.” Id. 

6. HHS cannot use the exact same methodology to identify the children of 
potential class members for the class expansion period of July 1, 2017 through June 25, 
2018 for three reasons. First, ORR has discharged the children in its care during the class 
expansion period, and thus lacks access to those children through grantees. Second, my 
current understanding is that CBP is likely not able to produce data sets for the time period 
before April 19, 2018, as CBP did not track parental separation data as a separate searchable
data point prior to that time. Third, the sheer number of ORR case management records, 
covering approximately 47,000 children referred to and discharged by ORR during the class 
expansion period, would overwhelm ORR’s existing resources were it to attempt a manual 
review of all records in date order. See Decl. of Jallyn Sualog, ECF No. 347-2.

7. I have therefore sought to develop a methodology to try as best as practicable 
to streamline and accelerate the identification of potential Ms. L. class members in the class 
expansion population by first identifying their children. To that end, I have consulted with
Barry Graubard, Ph.D., who is a senior biostatistician for the National Institutes of Health
(NIH), National Cancer Institute, Division of Cancer Epidemiology & Genetics,
Biostatistics Branch. NIH is an operating division of the U.S. Department of Health and
Human Services (HHS). 

8. Dr. Graubard has recommended pursuing a methodology that combines 
statistical analysis and manual review of ORR case management records. His 
recommendation is set forth in his declaration, which is attached as Exhibit A to the 
Proposed Expanded Ms. L. Class Identification Plan. In my testimony below, I explain how 
Defendants, based on the information known to them today, would likely implement Dr. 
Graubard’s recommendation. I would serve as the HHS Operational Lead for Reunification 
for the implementation. 

Plan for Implementing Recommended Methodology 

9. To implement Dr. Graubard’s recommended methodology, Defendants would
likely need to perform approximately 12 weeks of intensive data analysis before starting
manual reviews. That is, Defendants would likely need 12 weeks to format the data,
perform a regression analysis, and build a prediction model to segment and prioritize 
manual reviews of ORR case management records for the approximately 47,000 possible 
children of potential Ms. L. class members for the class expansion period. This approach 
would involve a series of steps, outlined below, that would be informed in real time by the 
data and would likely evolve as implementation progresses and the Defendants refine 
methods based on lessons learned. 

Within Approximately 4 Weeks of Plan Activation 

10. HHS would first prepare a data set encompassing all children referred to ORR
starting July 1, 2017, and discharged from ORR care prior to June 26, 2018.[1] I understand
that set to include approximately 47,000 children. See ECF No. 347-1.

[1] It is possible that some children referred to ORR care in early July 2017 would
have entered the United States before July 1, 2017. Such children would not be potential
children of possible Ms. L. class members.

11. Defendants would then convene a Data Analysis Team, reporting to the HHS,
CBP, and ICE Operational Leads for Reunification, to conduct statistical analyses of the
data set. A senior biostatistician (likely Dr. Graubard of the NIH) would serve as the Data 
Analysis Team lead, reporting directly to the Operational Leads for Reunification. 

12. The Data Analysis Team would conduct a regression analysis of the possible 
children of potential class members reported in the most recent Joint Status Report, ECF 
No. 388, using the approximately 12,000 children who were in ORR care on June 26, 2018 
as a “training set” to develop a prediction model. The Data Analysis Team would work to 
validate variables that may be predictive of a child having been separated from a parent 
(e.g., the age of the child), and attempt to identify any additional demographic features of
children separated from parents (as distinct from children who entered the United States 
without a parent). Through validation, the team would develop a prediction model 
correlating relevant variables with increased likelihood of parental separation. 

13. We expect that the data will inform the development of the prediction model, 
which will evolve in an iterative, stepwise manner. During the process, the Data Analysis 
Team may request additional data from HHS, CBP, or ICE as appropriate. 

Within Approximately 8 Weeks of Plan Activation 

14. Once the Data Analysis Team lead determines that an initial version of the 
prediction model is sufficient for use, the Data Analysis Team will apply it to the 
approximately 47,000 children for the class expansion period, and rank order children 
according to their probability of being children of potential Ms. L. class members. 

15. The Data Analysis Team would then stratify the approximately 47,000 children 
for the class expansion period into “bands” or “segments” based on statistical probability of 
parental class membership. The Defendants would prioritize the highest-probability
segments for manual review of ORR case management records and any other relevant
information.

16. Defendants would build and launch a team of contracted administrative staff
to conduct manual reviews of ORR case management records, which are maintained on the 
UAC Portal. This “Case File Review Team” would follow review protocols informed by 
the work conducted during the 2018 reunification. They would report to the HHS 
Operational Lead (who would work with the ORR career staff to train them). 

17. Once the manual review of the highest-probability segments begins, the Case 
File Review Team would begin preparing draft lists of possible children of potential Ms. L. 
class members and providing them to Defendants on a rolling, weekly basis. HHS, CBP, 
and ICE would review and validate these lists jointly. 

18. While the Case File Review Team conducts manual review of the highest- 
probability segments of children, the Data Analysis Team would conduct statistical 
sampling of the lower-probability bands. The Case File Review Team would test the 
samples through blind, manual review to enable the Data Analysis Team to determine 
whether the sample contains any children of potential Ms. L. class members. The outcome 
of the sampling process may result in adjustments to the variables, prediction model, or 
segments. It may also inform the approach to manual case file review of the lower- 
probability bands. If, for example, the samples yield no children of potential Ms. L. class 
members, then it may become appropriate for the parties to meet and confer on further 
streamlining. 
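
A minimal sketch of the sampling step described in paragraph 18, offered only as an illustration and not as the Data Analysis Team's actual procedure. The function names `draw_sample` and `zero_hit_upper_bound`, the identifiers, and the sample size are hypothetical. The bound assumes a simple random sample from a large band and relies on the standard result that, if n independent draws contain no positives, the one-sided upper confidence bound on the true rate p solves (1 - p)^n = 1 - confidence.

```python
import random


def draw_sample(child_ids, n, seed=0):
    """Draw a simple random sample of n case files from a lower-probability band."""
    return random.Random(seed).sample(child_ids, n)


def zero_hit_upper_bound(n, confidence=0.95):
    """Upper confidence bound on the true rate when n sampled files yield no hits.

    Solves (1 - p) ** n = 1 - confidence for p. Assumes independent draws
    from a band much larger than the sample.
    """
    return 1 - (1 - confidence) ** (1 / n)


# Hypothetical band of 10,000 case-file identifiers (not real data).
band_ids = list(range(10_000))
sample = draw_sample(band_ids, 300)
bound = zero_hit_upper_bound(len(sample))
```

Under these assumptions, a sample of 300 files with no children of potential class members would still be consistent with a true rate in that band of up to roughly 1 percent, which is the kind of figure the parties might weigh when conferring on further streamlining.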

Within Approximately 12 Weeks of Plan Activation 

19. HHS would review the discharge type and sponsor information in the UAC 
Portal to determine: (i) the type of discharge that resulted in the child exiting ORR care; 
(ii) whether a potential Ms. L. class member is the child’s sponsor of record; and (iii) the 
name, address, and relationship of the sponsor for each child of a potential Ms. L. class 
member who was discharged to an individual sponsor. 

20. Defendants would consolidate the HHS and DHS information into final, rolling 
lists, which DOJ would provide to Class Counsel. Where available, the rolling lists would
include the names and alien identification numbers for both children and their class member
parents; their dates of apprehension; the dates children were referred to and discharged from
ORR care, and the type of discharge; parent detention status; and last known parent contact 
information. 

Total Time for Completion 

21. Jallyn Sualog, the Deputy Director for Children’s Programs for ORR, testified 
previously that it would likely take between 235 and 471 consecutive calendar days for 100 
ORR analysts to manually review the ORR case management records for the approximately 
47,000 children in ORR care during the class expansion period. If Defendants were able to 
hire qualified contractors, then I expect it would take at least the same number of 
consecutive calendar days to perform the same work on a date-ordered or randomized 
manual file review. 

22. The goal of pursuing Dr. Graubard’s recommended methodology is to identify 
children of potential Ms. L. class members in the class expansion population in a faster and 
more concentrated way than would occur through a date-ordered or randomized manual file 
review. The application of the methodology in this context is novel. 

23. The time for completing the process using Dr. Graubard’s recommended
methodology, including manual review of ORR case management records prioritized
through probabilistic segmentation, may vary for at least three reasons. First, the efficacy
of the initial prediction model, and the outcomes of the sampling of the lower-probability 
segments, are not known at this juncture. They are likely to drive the time for completion. 
Second, the pace of the prioritized manual review will depend on the number of qualified 
contractors that Defendants are able to identify and retain for the Case File Review Team, 
as well as the speed with which Defendants are able to scale up the team, and the efficiencies 
that may or may not materialize from having a dedicated group of professionals manually 
reviewing case files over a period of months. Third, any meet-and-confer process that 
occurs after completion of the sampling phase could affect the time for completion. Many 
of these considerations are outside Defendants’ control.

24. Given the complexity of the task and the variables and data known to
Defendants at this time, a reasonable assumption is that it will take at least 12 months, and 
possibly up to 24 months, for Defendants to complete the process of identifying potential 
Ms. L. class members in the class expansion population through universal manual review. 
The primary benefit of pursuing Dr. Graubard’s recommended methodology is that, if 
successful, it would front-load the identification of potential Ms. L. class members and 
possibly lead to a reduction in the overall time required for manual review. 


Executed on April 5, 2019.

UNITED STATES DISTRICT COURT
SOUTHERN DISTRICT OF CALIFORNIA 


MS. L., et al. Case No. 18cv428 DMS MDD

Petitioners-Plaintiffs, Hon. Dana M. Sabraw

vs.

U.S. IMMIGRATION AND CUSTOMS
ENFORCEMENT, et al.,

Respondents-Defendants.


DECLARATION OF BARRY GRAUBARD 

I, Barry I. Graubard, declare under penalty of perjury, pursuant to 28 U.S.C. § 1746, 
that my testimony below is true and correct: 

1. I am a Senior Investigator in the Biostatistics Branch of the National Cancer 
Institute. See https://dceg.cancer.gov/about/staff-directory/biographies/A-J/graubard-barry
(last visited April 5, 2019). A copy of my curriculum vitae is attached as Exhibit 1. 

2. I have more than 40 years of experience conducting statistical methods 
research in biostatistics and survey sampling, and in collaborating with scientists on 
research in cancer epidemiology and other areas of epidemiology and public health. For 
example, I recently performed modeling to estimate the one-year probability that an 
individual would get oropharyngeal cancer based on various risk factors. The paper 
reporting this work has been submitted for publication to a peer-reviewed journal. The 
statistical techniques used in this study were regression modeling and cross validation. 

3. I have also used other regression methods such as Cox proportional hazard 
regression to predict length of survival (e.g., among liver transplant recipients based on 
patient characteristics and clinical risk factors).

4. The statements in this declaration are based on my personal knowledge,
information acquired by me in the course of performing my official duties, information 
supplied to me by federal government employees, and government records. 

5. I am making this declaration for use in Ms. L. v. U.S. Immigration and Customs
Enforcement, No. 18cv428 (S.D. Cal.). 

6. I understand that on March 8, 2019, the Court in Ms. L. modified the class 
definition. The class now includes: “All adult parents who entered the United States at or 
between designated ports of entry on or after July 1, 2017, who (1) have been, are, or will 
be detained in immigration custody by the DHS, and (2) have a minor child who has been, 
is or will be separated from them by DHS and has been, is or will be detained in ORR 
custody, ORR foster care, or DHS custody, absent a determination that the parent is unfit 
or presents a danger to the child.” ECF No. 386. I further understand that the modified class
is subject to the same qualifications as the original certified class, and that as a result, it is 
still the case that “the class does not include migrant parents with criminal history or 
communicable disease, or those who are in the interior of the United States or subject to the 
EO.” ECF No. 82. 

7. Commander Jonathan White of the United States Public Health Service has
asked me to recommend a statistical methodology to try to streamline and accelerate the 
identification of the children of Ms. L. class members who were referred to and discharged 
by ORR during the class expansion period of July 1, 2017 through June 25, 2018, and to 
advise an inter-agency Data Analysis Team that would seek to implement the methodology. 
My understanding is that approximately 47,000 alien children were referred to and 
discharged by ORR during that period. An optimal statistical methodology would enable 
ORR to prioritize manual record reviews for the approximately 47,000 children based on 
the probability that the child’s parent is a Ms. L. class member. 

8. I will refer to the approximately 47,000 children who were referred to and 
discharged by ORR during the class expansion period of July 1, 2017 through June 25, 2018 as
the “test set.”

9. I will apply two assumptions to promote an inclusive and thorough review.
First, I will assume that any alien child who was apprehended by the U.S. Department of 
Homeland Security (DHS) at the southern border together with a parent, and who was 
referred to ORR care by DHS, was possibly separated from the parent by the federal 
government. Second, I will assume that any alien child who was referred to and discharged 
by ORR during the class expansion period is a child of a potential Ms. L. class 
member. These assumptions can be expected to include many children who were not 
separated from their parents, but will promote a thorough review. 

10. Based on these assumptions, I recommend using an empirically-determined 
model to try to predict the probability for each child that a parent accompanied the child 
before he or she was referred to ORR care. These probabilities would be used to group 
children from the test set into strata based on the probability that a parent is a potential class 
member. A separate Case File Review Team would then review the ORR case management 
records for the children in the test set. The records of the children in the strata with the 
highest probabilities would be reviewed before strata with lower probabilities, thereby 
identifying more children of class members in the test set in a speedier fashion. 

11. I recommend that the Data Analysis Team seek to develop a prediction model 
by analyzing data for the approximately 12,000 children in ORR care as of June 26, 2018 
(the “training set”). I understand that at this point, the government knows which children 
in the training set were children of potential Ms. L. class members. See Joint Status Report, 
ECF No. 388. By analyzing the data associated with these children, the Data Analysis Team 
would seek to identify common independent variables that together would provide a 
framework for rank ordering other children by the likelihood that their parent is a Ms. L. 
class member. The list of potentially relevant independent variables would include: 

• Child age, because tender-age and young children are more dependent on parents 
than older children, and may therefore be more likely to travel with parents than with 
other adults or children;

• The referring U.S. Customs and Border Protection (“CBP”) Sector, because I
understand that at least one CBP sector is alleged to have conducted a pilot program 
involving increased rates of referrals for prosecutions of immigration law violations; 

• Sibling information, because younger children who are not in sibling groups may 
have a higher probability of having been separated than younger children 
accompanied by older siblings; 

• ORR discharge type, because discharge to a family member other than a parent, or 
discharge type other than release to an individual sponsor, might correlate with a 
higher probability of a child having been separated from a parent; 

• Appearance of the word “separated” or “separation” in text box data fields on the 
ORR Portal corresponding with either the initial assessment of the child or a 
Significant Incident Report; and 

• Inclusion on any informal tracking list of separated children that ORR created during 
the class expansion period. 

12. To develop a prediction model, the Data Analysis Team would analyze the 
training set data with statistical analysis software. If the software proposes multiple models, 
then the Data Analysis Team would apply a statistical method known as cross validation to 
identify the most appropriate model to predict parental class membership within a given 
subset of the training set. 
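
A minimal sketch of how cross validation can choose between candidate models, assuming a training set in which each record carries a known separation outcome. It is not the Data Analysis Team's method: a simple per-category rate model stands in for the regression models contemplated above, the field names (`age_band`, `sector`, `separated`) are hypothetical, and the records are synthetic, invented solely so the example runs.

```python
import random
from collections import defaultdict


def k_fold_indices(n, k, seed=0):
    """Shuffle row indices and deal them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_rate_model(rows, feature):
    """Stand-in model: P(separated) is the observed rate for each feature value."""
    counts = defaultdict(lambda: [0, 0])  # value -> [separated, total]
    for r in rows:
        counts[r[feature]][0] += r["separated"]
        counts[r[feature]][1] += 1
    overall = sum(r["separated"] for r in rows) / len(rows)
    rates = {value: s / t for value, (s, t) in counts.items()}
    # Fall back to the overall rate for values unseen during fitting.
    return lambda r: rates.get(r[feature], overall)


def brier_score(model, rows):
    """Mean squared error between predicted probability and outcome (lower is better)."""
    return sum((model(r) - r["separated"]) ** 2 for r in rows) / len(rows)


def cross_validate(rows, feature, k=5):
    """Average held-out Brier score over k folds for a model built on one feature."""
    scores = []
    for held in k_fold_indices(len(rows), k):
        held_set = set(held)
        train = [r for i, r in enumerate(rows) if i not in held_set]
        test = [rows[i] for i in held]
        scores.append(brier_score(fit_rate_model(train, feature), test))
    return sum(scores) / k


# Synthetic records, invented for illustration only (not real data).
rows = []
for i in range(200):
    young = i % 2 == 0
    rows.append({
        "age_band": "0-4" if young else "13-17",
        "sector": "A" if i % 3 == 0 else "B",
        "separated": 1 if (young and i % 10 != 0) else 0,
    })

cv_scores = {f: cross_validate(rows, f) for f in ("age_band", "sector")}
best_feature = min(cv_scores, key=cv_scores.get)
```

Each candidate is scored only on records it did not see while being fitted, so the comparison reflects how well a model might carry over to new records rather than how closely it fits the training set.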

13. Once the most appropriate model is identified, the Data Analysis Team would 
try to apply it to the available data for the test set of approximately 47,000 children referred 
to and discharged by ORR between July 1, 2017 and June 25, 2018. By applying the 
predictive model to the test set, the Data Analysis Team would identify the children in the 
test set who are more likely to have parents who are Ms. L. class members. As noted above, 
the use of the model in this way would enable the Data Analysis Team to organize the test 
set into strata according to increasing probability of parental class membership, to prioritize 
manual case file review. 
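
A minimal sketch of the rank-ordering and stratification step, assuming each child in the test set has already been assigned a predicted probability by some model. The function name `stratify`, the three band labels, and the cutoff values are hypothetical choices made for illustration; the declaration does not specify them.

```python
def stratify(scored, cutoffs=(0.5, 0.1)):
    """Group (child_id, probability) pairs into bands, highest probability first.

    `cutoffs` holds the hypothetical lower limits of the "high" and "medium"
    bands; everything below the second cutoff falls into "low".
    """
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    bands = {"high": [], "medium": [], "low": []}
    for child_id, probability in ranked:
        if probability >= cutoffs[0]:
            bands["high"].append(child_id)
        elif probability >= cutoffs[1]:
            bands["medium"].append(child_id)
        else:
            bands["low"].append(child_id)
    return bands


# Hypothetical identifiers and probabilities (not real data).
scored = [("a", 0.90), ("b", 0.05), ("c", 0.30), ("d", 0.70)]
bands = stratify(scored)
review_order = bands["high"] + bands["medium"] + bands["low"]
```

Because each band keeps its members in descending order of probability, concatenating the bands gives a single review queue that places the likeliest records first.
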
14. As the Data Analysis Team applies the prediction model to the test set, the
process may result in refinements to the model and segments themselves. For example, if
the Case File Review Team positively identifies children of potential Ms. L. class members
within a lower-probability band of the test set, this may result in the Data Analysis Team
updating the variables it considers as part of its model.

15. The feasibility of this statistical method may turn on the availability, format, 
and comprehensiveness of the data for the children. Assuming, however, that the data is 
sufficient, the statistical method that I have described is a more rational approach than a 
date-ordered or randomized manual record review of the test set. If successful, it would 
front-load the identification of potential Ms. L. class members. It is possible that it could 
also reduce the overall time required for manual review. 

Executed on April 5, 2019. 



Barry I. Graubard

CURRICULUM VITAE


Name: Barry Ira Graubard January 15, 2019

Work Address: Biostatistics Branch

Division of Cancer Epidemiology and Genetics,

National Cancer Institute

9609 Medical Center Drive RM 7-E140 MSC 9780

Bethesda, MD 20892-7354

Phone: (240) 276-7316; Fax: 240-276-7838

E-mail: graubarb@mail.nih.gov

Citizenship: United States 


Education:

1968 High School Graduation, Groveton High School, Alexandria, VA 

1968-1970 (68 Semester Hours, Major: Chemistry and Mathematics) Rensselaer 

Polytechnic Institute, Troy, New York 

1972 B.S. (Major in Mathematics, Minor in Physics) University of Maryland, 
College Park, MD 

1974 M.A. (Mathematics, Area: Statistics and Probability) Department of 
Mathematics, University of Maryland, College Park, MD

1991 Ph.D. (Mathematical Statistics) Department of Mathematics, University 
of Maryland, College Park, MD 


Other Training:


1977-1979 (12 Semester Hours) Survey Sampling and Biostatistics, George
Washington University, Washington, DC


Employment:


1972-1976 Graduate teaching assistant in the Department of Mathematics,
University of Maryland at College Park

1977-1980 Mathematical Statistician, National Center for Health Statistics

1980-1981 Mathematical Statistician, Alcohol Drug Abuse and Mental Health
Administration

1981-1989 Mathematical Statistician, National Institute of Child Health and Human
Development, Biometry Branch

1989-1996 Senior Researcher, National Cancer Institute, Biometry Branch,
Clinical and Diagnostic Trials Section

1996-1997 Acting Chief, Biostatistical Methodology and Cancer Control Section,
National Cancer Institute, Biometry Branch

1997-1999 Senior Associate, Department of Biostatistics, Johns Hopkins University,
taught a semester course “Analysis of Health Surveys”

2001-2002 Guest Lecturer, Department of Mathematics, University of Maryland,
taught a one semester workshop entitled “Analysis of Health Surveys”

1997-pres Senior Investigator, Title 42, National Cancer Institute, Biostatistics
Branch.


Membership in Professional Societies: 

1977-pres American Statistical Association 
1977-pres Washington Statistical Society 

1980-pres International Biometric Society Eastern North American Region (ENAR) 
2010-pres American Association for the Advancement of Science 


Selected Committee and Board Membership: 


1988-1990 ENAR Biometrics Society Regional Advisory Board 

1991-1994 Washington Statistical Society Public Health and Biostatistics Program 
Chair 


1994-1995 American Statistical Association Biometrics Section Program Chair 

1994-1997 American Statistical Association Continuing Education Advisory 
Committee 


1994 American Statistical Association ad hoc committee to review candidates
for travel awards to 50th Session of the International Statistical Institute,
1995


1997-2001 American Statistical Association, Survey Methods Research Section,
Chair, Continuing Education Committee

1998-2001 ENAR Biometrics Society Regional Committee
1998-1999 NCI Surveillance Implementation Group

1999-01 Ad hoc ENAR Biometrics Society Membership Committee

1999-pres Federal Committee on Statistical Methodology

2000 Chair of Search Committee for tenure track / tenure research mathematical
statistician, Biometry and Mathematical Statistics Branch, National
Institute of Child Health and Human Development, NIH

2000-pres Program Committee for Federal Committee on Statistical Methods
Research Conference

2001-02 Program Committee for ENAR Biometrics Society 2002 Spring Meeting

2001 Chair of Search Committee for tenure track / tenure research mathematical
statistician, Biometry and Mathematical Statistics Branch, National
Institute of Child Health and Human Development, NIH

2001-04 United Nations Committee and contributor to UN Technical Report on the 
Analysis of Operating Characteristics of Surveys in Developing Countries. 

2003-07 Editorial Board of the JNCI Cancer Spectrum

2004 Member of the National Children's Study Sampling Design Workshop, 
March 21-22. 

2004 Institute of Medicine Workshop on Estimating the Contribution of Lifestyle-
Related Factors to Preventable Death, Dec. 13-14; presented “Calculating the
number of deaths attributable to risk factor using national survey data.”

2005-06 Co-Program Chair of Section on General Methodology, American
Statistical Association, 2006 Joint Statistical Meetings

2005-10 Advisory Board for the University of Minnesota Integrated Health Interview
Series Project

2005 Expert Advisory Group to advise Harvard U on statistical methods for 
combining data from multiple surveys for developing measures of the 
diffusion and use of health information technology 

2006-08 ENAR Education Advisory Committee

2007-09 Chair of the American Statistical Association Committee on the Award of
Outstanding Statistical Application

2007-08 Chair of the Division of Cancer Epidemiology and Genetics Committee on 
Scientists 

2009 Chair Elect of the Biometric Section, American Statistical Association

2009 Member Expert Panel on the Redesign of the National Crime Victimization
Survey

2009-10 DCEG Technical Evaluation of Protocols Committee

2009-10 Member of Selection Committee for Committee of Presidents Statistical
Societies (COPSS) Snedecor Award

2010-11 Chair, Selection Committee for COPSS Snedecor Award

2009-10 Member of Selection Committee for Biometrics Section, American
Statistical Association, David P Byar Award

2011 Chair Selection Committee for Biometrics Section, American
Statistical Association, David P Byar Award

2009-10 DCEG Technical Evaluation of Protocols Committee

2011 Chair of Search Committee for tenure track / tenure research
biostatistician/statistician, Radiation Epidemiology Branch, NCI


2011-2012 DCEG Technical Evaluation of Protocols Committee 
2011-pres Member of DCEG Promotion and Tenure Review Panel 


2013 Reviewer for the American Statistical Association, National Science
Foundation and Bureau of Labor Statistics Fellowship Program
http://www.amstat.org/careers/pdfs/ASANSFBLSFellowshipProgram.pdf

2013 Reviewer for proposal to the Luxembourg National Research Fund (FNR)
INTER MOBILITY programme.

2014-17 Washington Statistical Society Morris Hansen Lecture Committee

2014-15 Member of the Committee of Presidents of Statistical Societies (COPSS)
Elizabeth L. Scott Award Committee

2016-17 Chair, Committee of Presidents of Statistical Societies (COPSS)
Elizabeth L. Scott Award Committee

2014-17 Committee of Representatives to American Association for the
Advancement of Science (AAAS)

2015 Patient-Centered Outcomes Research Institute (PCORI) Obesity
Observational Research Initiative Merit Review Panel

2017 Panel member of FDA Public Workshop on Abuse-Deterrent Opioids in 
Silver Spring, Md, July 10-11, 2017 

2018-20 American Statistical Association Committee on Fellows 


Editorial Boards 

1997-pres Statistical Editor, Journal of the National Cancer Institute 
2008-14 Editorial Board ASA/SIAM Book Series 
2008-pres Associate Editor, Annals of Applied Statistics 


Selected Lectures and Presentations: 

1993 Invited Presentation, The Biometric Society-ENAR Spring Meetings, 
Philadelphia, PA, “Statistical Validation of Intermediate Endpoints for Chronic
Diseases.” 

1994 Invited Presentation, The Drug Information Association, Washington, DC, 
“Regression Analysis of Clustered Data.” 

1995 Invited Presentation, The Joint Statistical Meetings of the American Statistical 
Association, Orlando, FL, “Analysis of Population Based Case-Control Studies 
with Controls Selected from a Survey.” 

1996 Invited Presentation, Bureau of Medical Devices, Food and Drug 
Administration, “Analysis of Clustered Data.” 

1997 Invited Presentation, Department of Mathematics, University of Maryland, 
“Variance Estimation for Superpopulation Parameters” 

1999 Invited Presentation, Department of Statistics, Texas A&M University, 
“Variance Estimation for Superpopulation Parameters.”

1994-2006 Invited Lecturer, Cancer Prevention and Control Fellowship Course, NCI,
“Analyzing Health Surveys: Accounting for the Sample Design.”

2000 Keynote Speaker, The 2000 Statistical Science Awards Ceremony, Centers for 
Disease Control and Prevention, Atlanta, GA, “Statistical Issues in Analyzing 
Health Surveys: Applications to Cancer Studies.” 

2001 Invited lecturer at the University of Maryland, Department of Mathematics,
College Park, to teach fall semester workshop “Analysis of Health Survey Data”
(Course: STAT 798A section 0104); meets one day a week for 1.5 hours.

2002 Invited presentation Joint Statistical Meetings, “Issues in Design-based 
Weighted Analysis of Survey Data” 

2002 Invited 1-day course “Analysis of Complex Survey Data with Applications
to Health Surveys” for the Statistics Canada 2002 Methodology Symposium on
Modeling Survey Data for Social and Economic Research

2003 Invited tutorial at 2003 Spring ENAR Meeting: “Sample Survey Methods for 
Biostatisticians” 

2003 Invited discussant at 2003 Spring ENAR Meeting “Sampling methods for 
selecting population controls” 

2003 Invited speaker at Westat methodology seminar “Estimating of Variance
Components using Survey Data.” 

2004 Invited Short Course at Eleventh Annual Spring Research Conference,
“Analysis of Complex Surveys.”

2004 Invited presentation Joint Statistical Meetings, “Development of statistical 
methods to analyze complex health surveys for epidemiologic studies: Some 
methods and applications.” 

2004 Invited presentation at Harvard University School of Public Health, “Analyzing 
Survey Data: Estimation of population attributable risk and population variance 
components.” 

2004 Invited Discussant for Distinguished Lecture by Chris Skinner for Joint 
Program in Survey Methodology, University of Maryland, “Other Issues in 
Modeling Survey Data.” 

2005 Invited presentation University of Maryland School of Medicine, Baltimore, 
“Statistical issues in analyzing health surveys: application to cancer and 
mortality studies.” 

2005 Invited Discussant for Distinguished Lecture by Alastair Scott for Joint Program 
in Survey Methodology, University of Maryland, “Discussion of population- 
based case-control studies.” 

2006 Invited presentation Spring ENAR Meeting, Tampa, Fla. “Using national
surveys to estimate the number of deaths attributable to a risk factor” 

2006 Special Contributed Panel Session presentation Joint Statistical Meetings,
Seattle, WA, “Finite population vs. superpopulation inference in sample
surveys: How big is the difference?”

2006 Invited presentation Statistics Canada Symposium 2006, Ottawa, Canada,
“Using national surveys to estimate the number of deaths attributable to a risk
factor”

2006 Invited short course for the International Biometrics Conference, Montreal, 
Canada, “Analysis of Health Survey: Sample Survey Methods for 
Biostatisticians” 

2007 Invited panel member of “Role of biostatisticians in policy issues” for the 
Spring ENAR Meeting, Atlanta, GA. 

2007 Invited presentation at Mathematica, “To weight or not to weight” 

2008 Invited presentation Joint Statistical Meetings, Denver, Colorado “Application 
of Peters-Belson to estimation of disparities.” 

2009 Invited presentation for Conference in Honor of Joseph Gastwirth, George 
Washington University, Washington, DC, “The use of the risk percentile curve
in the analysis of epidemiologic data.” 

2009 Invited presentation for Joint Statistical Meetings, Washington, DC, “Use of 
Statistics at the Centers for Disease Control and Prevention and National 
Cancer Institute: Estimation of the numbers of all-cause and cause-specific 
deaths associated with body weight.” 

2011 Invited presentation Department of Statistics, George Washington 
University, “Conditional logistic regression with survey data.”

2011 Invited presentation National Center for Health Statistics, “Conditional logistic 
regression with survey data.” 

2013 Invited presentation National Institute of Environmental Health Sciences, 
“Conditional logistic regression with survey data.” 

2013 Invited presentation. Scholars Summer at Census, US Census Bureau, 
“Conditional logistic regression with survey data.” 

2013 Invited presentation for Fall Outreach Symposium for the International Year of
Statistics at the Bureau of Labor Statistics, “Estimating sibling recurrence risk
in population sample surveys.” 

2014 Invited presentation Statistical Society of Canada 2014 Annual Meeting, 
Toronto, “Estimating sibling recurrence risk in population sample surveys.”

2018 Invited presentation at the 2018 Joint Statistical Meetings, “Population-Based
Disease Risk Prediction Modeling Using National Survey, Clinical, and
Registry Data: Application to Risk Prediction for Oropharyngeal Cancer in the
US Population,” Vancouver, Canada

2018 Invited Talk George Washington University, School of Public Health,
“Statistical and Epidemiological Challenges in Utilizing the National Health and
Nutrition Examination Survey (NHANES) Assessment of Oral Human
Papillomavirus (HPV) Infection to Study Risk of HPV Infection and of
Oropharyngeal Cancer in the US,” Washington, DC

Recent Grants 

Unpaid Collaborator 

“Trends in Socioeconomic Position and Diet Relationship” CA108274, PI: Kant,
Ashima, Queens College, NY, July, 2004 to June, 2007

Unpaid Collaborator 

“SNP-based pseudo-semiparametric inference for the case-control studies” NIH-
U01 CA159424, National Institutes of Health, PI: Li, Yan, University of Maryland,
College Park, MD, September, 2011 to August, 2013.

Unpaid Collaborator 

“Semiparametric inference for case-control studies with complex sampling” NIH
8513069, National Institutes of Health, PI: Li, Yan, University of Maryland, College Park,
MD, September 24, 2013 to August 31, 2014.

Teaching Experience: 

1972-76 Graduate Teaching Assistant - Conducted recitation classes for undergraduate 
courses in college algebra, calculus, linear algebra, and was a lecturer for 
introductory statistics course (STAT 100) for non-mathematics majors. 

1980 Lecturer for a one semester undergraduate course in elementary probability and 
stochastic processes for non-mathematics majors in Department of Mathematics,
University of Maryland. 

1997 Adjunct Professor at Johns Hopkins University Department of Biostatistics 
where I taught a one semester graduate course entitled “Analysis of Health 
Survey Data” 

2001 Invited lecturer at the University of Maryland, Department of Mathematics,
College Park, to teach fall semester workshop “Analysis of Health Survey Data”
(Course: STAT 798A section 0104); met one day a week for about 1.5 hours.

2002 Invited 1-day course “Analysis of Complex Survey Data with Applications
to Health Surveys” for the Statistics Canada 2002 Methodology Symposium on
Modeling Survey Data for Social and Economic Research.

2003 Invited tutorial at 2003 Spring ENAR Meeting: “Sample Survey Methods for 
Biostatisticians” 

2004 Invited Short Course at Eleventh Annual Spring Research Conference, 
“Analysis of Complex Surveys.” 

2006 Invited short course for the International Biometrics Conference, Montreal, 
Canada, “Analysis of Health Survey: Sample Survey Methods for 
Biostatisticians” 

2015 Co-taught “Statistical Methods for Analysis of Complex Samples in
Public Health” at University of Maryland, College Park, MD, course number
SURV 699N for the Joint Program in Survey Methods.


Primary Mentor: 

NCI Post-Doctoral Fellows:

Dr. Sowmya R Rao, 2002-2004, presently Associate Professor at the University of
Massachusetts Medical School, Worcester, MA and Senior Statistician in the Center
for Health Quality, Outcomes and Economic Research (CHQOER) in the Veterans
Administration Health Services Research and Development Service

Dr. Yan Li, 2006-2008, presently Associate Professor at the Joint Program of Survey 
Methods, University of Maryland, College Park, MD

Dr. Sonya Heltshe 2008-2009, presently Assistant Professor and Senior Statistician at 
Seattle Children's Hospital, Seattle WA Center for Clinical and Translational Research 

Dr. Victoria Landsman 2009-2011, presently Scientist & Biostatistician at Institute for 
Work and Health and Adjunct Professor at University of Toronto, Assistant Professor. 

Dr. Orestis Panagiotou 2015-2016, presently Assistant Professor of Health Services, 
Policy and Practice (Research) at Brown University. 

Dr. Noorie Hyun 2016-2017, presently Assistant Professor, Medical College of 
Wisconsin, Institute for Health & Equity, Division: Biostatistics Program 

Dr. Marlena Maziarz 2017-2018, presently Assistant Professor, Lund University, 
Sweden. 

Dr. Gregory Haber 2018- present.


Co-Advisor for Ph.D. Candidates:

Blossom H Patterson, Doctoral Dissertation (1998): “Latent Class Analysis of Sample 
Surveys,” College of Education, Department of Measurement and Statistics, University 
of Maryland. 

Dewei She, Doctoral Dissertation (2010): “Genetic Association Studies Using Complex 
Survey Data,” Department of Statistics, George Washington University. 

Wenliang Yao, Doctoral Dissertation (2012): “Estimation of ROC Curve with Complex 
Survey Data”, Department of Biostatistics, George Washington University. 

Cong Wang, Doctoral Dissertation (2017): “Analysis of Familial Aggregation Using
Recurrence Risk for Complex Survey Data”, Department of Statistics, George 
Washington University. 

April D. Kidd, Doctoral Dissertation (2017): “Mammography Utilization in African 
American Women”, School of Nursing, Duquesne University. 

Lingxiao Wang, Doctoral Dissertation (currently). Topic: Making cohort studies
representative of the US population using weighting methods. Dept. of Joint Program of
Survey Methodology, University of Maryland

Yan Liu, Doctoral Dissertation (currently). Topic: Generalized Score Test for Complex
Sample Data, Dept. of Statistics, George Washington University.

Ph.D. Dissertation Committees:

Dr. Blossom H Patterson, Dept. of Measurement, Statistics and Evaluation, University of
Maryland, College Park 

Dr. Tara Vogt, Dept Epidemiology, Yale University 

Dr. Steven Moore, Dept Epidemiology, Yale University 

Dr. Leah M Ferrucci, Dept Epidemiology, Yale University

Dr. Jianzhu Li, Dept. JPSM, University of Maryland, College Park.

Dr. Santanu Pramanik, JPSM, University of Maryland, College Park. 

Dr. Hiroyuki Hikawa, Dept. of Statistics, George Washington University

Dr. Wenliang Yao, Dept. of Biostatistics, George Washington University

Dr. Cong Wang, Dept. of Statistics, George Washington University. Title: Analysis of
Familial Aggregation using Recurrence Risk for Complex Survey Data. 10/2017

Dr. April D. Kidd, School of Nursing, Duquesne University. Title: Mammography
Utilization in African American Women. 11/2017

Dr. Xia Li, Dept. Mathematics, University of Maryland, College Park. Title: Misspecified
Weights in Weight-Smoothing Methods. 1/2018

Research Interests: 

Design and Analysis of Complex Surveys and Epidemiologic Studies 
Statistical Methods for Design and Analysis of Epidemiological Studies 
Analysis and Design of Cluster Randomized/Community Studies and Nonrandomized 
Evaluation Studies 

Classification and Discriminant Analysis 
Population Genetics and Genetic Epidemiology 

Reviewer for Selected Journals:

American Journal of Clinical Nutrition 
American Journal of Epidemiology 
American Journal of Public Health 
Annals of Applied Statistics 
Biometrics 
Biometrika 

Controlled Clinical Trials 
Epidemiology 

Journal of the American Statistical Association 

Journal of the American Medical Association 

Journal of Clinical Epidemiology 

Journal of the National Cancer Institute 

Statistics in Medicine 

Survey Methodology 

Journal of Official Statistics 

New England Journal of Medicine 

Honors and Awards: 

1987 Quality Step Award, NICHD 

1990 Snedecor Award - Presented by the American Statistical 
Association and the Biometric Society 

1999 NCI Special Service Award of $5,000 for statistical leadership on the 
ASSIST Evaluation 

1999 NIH Merit Award for fundamental contributions to statistical methods for 
survey studies, and exemplary collaborations in the analysis and 
interpretation of survey data. 

2000 NIH Merit Award for extraordinary efforts in developing a conceptual 
framework and evaluation design for the American Stop Smoking 
Intervention Study (ASSIST) 

2000 Elected Fellow of the American Statistical Association

2001 Division of Cancer Epidemiology and Genetics, NCI Mentor of the Year 
Award 

2004 NIH Merit Award for consistent and high-quality effort work on the 
National Health Interview Survey and the California Health Interview 
Survey 

2006 Charles C Shepard Science Award for Assessment and Epidemiology
presented for scientific excellence by the publication of “Excess deaths
associated with underweight, overweight, and obesity,” JAMA 2005;
293:1861-1867.

2009 NIH Merit Award for excellence in the measurement, analysis, and release 
of nationally representative data concerning serum biomarkers from the 
insulin-like growth factor axis. 

2010 NCI Mentor of Merit Award for excellence in mentoring post and pre- 
doctoral fellows 

2013 AAAS Fellow of Statistics Section 

2015 NCI Group Merit Award: NCI Select Agents and Hazardous Biological 
Materials Search 

2018 NCI Mentor Award 